Data-reuse exploration under an on-chip memory constraint for low-power FPGA-based systems

Authors

  • Qiang Liu
  • George A. Constantinides
  • Kostas Masselos
  • Peter Y. K. Cheung
Abstract

Contemporary FPGA-based reconfigurable systems are widely used to implement data-dominated applications. In these applications, data transfer and storage consume a large proportion of the system energy. Exploiting data reuse can yield significant power savings, but it also requires additional on-chip memory. To aid data-reuse design exploration early in the design cycle, we present an optimization approach that achieves a power-optimal design satisfying an on-chip memory constraint on a targeted FPGA-based platform. The data-reuse exploration problem is formulated mathematically and shown to be equivalent to the Multiple-Choice Knapsack Problem (MCKP). For a given application code, the solution determines which array references are buffered on-chip and where in the code the reused data of those array references are loaded into on-chip memory, so as to minimize power consumption for a fixed on-chip memory size. We also present an experimentally verified power model that provides relative power information for the different data-reuse design options of an application, enabling fast and efficient design space exploration. The experimental results demonstrate that the approach finds the most power-efficient design for all the benchmark circuits tested.
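The MCKP structure mentioned in the abstract can be illustrated with a small sketch. This is not the authors' formulation or solver: it is a generic textbook dynamic program, assuming that memory sizes are small non-negative integers (for example, numbers of on-chip RAM blocks) and that each array reference has a list of candidate data-reuse options including a "no buffering" option that uses no on-chip memory. The function name `solve_mckp` and all numbers are hypothetical.

```python
from typing import List, Tuple

# One option = (on-chip memory units required, relative power).
# Hypothetical representation; the paper's own power model is not reproduced here.
Option = Tuple[int, float]


def solve_mckp(classes: List[List[Option]], capacity: int) -> Tuple[float, List[int]]:
    """Choose exactly one option per class (array reference) so that total
    power is minimal and total memory does not exceed `capacity`.
    Returns (minimal power, list of chosen option indices)."""
    inf = float("inf")
    # best[m] = minimal power for the classes processed so far
    # when at most m memory units may be used.
    best = [0.0] * (capacity + 1)
    picks_per_class: List[List[int]] = []

    for options in classes:
        new_best = [inf] * (capacity + 1)
        picks = [-1] * (capacity + 1)
        for m in range(capacity + 1):
            for idx, (mem, power) in enumerate(options):
                if mem <= m and best[m - mem] + power < new_best[m]:
                    new_best[m] = best[m - mem] + power
                    picks[m] = idx
        best = new_best
        picks_per_class.append(picks)

    if best[capacity] == inf:
        raise ValueError("no selection fits in the given on-chip memory")

    # Walk backwards through the recorded picks to recover the selection.
    selection: List[int] = []
    m = capacity
    for k in range(len(classes) - 1, -1, -1):
        idx = picks_per_class[k][m]
        selection.append(idx)
        m -= classes[k][idx][0]
    selection.reverse()
    return best[capacity], selection


# Two hypothetical array references, three reuse options each.
# Option 0 is "no on-chip buffer" in both classes.
example = [
    [(0, 10.0), (4, 6.0), (8, 3.5)],
    [(0, 8.0), (2, 5.0), (6, 4.0)],
]
power, chosen = solve_mckp(example, capacity=8)
print(power, chosen)
```

With a budget of 8 memory units, the sketch picks the middle option for both references, because giving all the memory to the first reference leaves nothing for the second. The running time grows with the product of the capacity and the total number of options, so this approach only suits coarse memory units.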


Similar resources

FPGA Implementation of a Hammerstein Based Digital Predistorter for Linearizing RF Power Amplifiers with Memory Effects

Power amplifiers (PAs) are inherently nonlinear elements and digital predistortion is a highly cost-effective approach to linearize them. Although most existing architectures assume that the PA has a memoryless nonlinearity, memory effects of the PAs in many applications, such as wideband code-division multiple access (WCDMA) or orthogonal frequency-division multiplexing (OFDM), can no longer b...


Data Reuse and Parallelism in Hardware Compilation

This thesis presents a methodology to automatically determine a data memory organisation at compile time, suitable to exploit data reuse and loop-level parallelization, in order to achieve high performance and low power design for data-dominated applications. Moore’s Law has enabled more and more heterogeneous components integrated on a single chip. However, there are challenges to extract maxi...


Multiprocessor System-on-Chip Data Reuse Analysis for Exploring Customized Memory Hierarchies

The increasing use of Multiprocessor Systems-on-Chip (MPSoCs) for high performance demands of embedded applications results in high power dissipation. The memory subsystem is a large and critical contributor to both energy and performance, requiring system designers to perform exploration of low power memory organizations. In this paper we present a novel multiprocessor data reuse analysis tech...


Performance Analysis and Implementation of Predictable Streaming Applications on Multiprocessor Systems-on-Chip

Driven by the increasing capacity of integrated circuits, multiprocessor systems-on-chip (MPSoCs) are increasingly widely used in modern consumer electronics devices. In this thesis, the performance analysis and implementation methodologies of predictable streaming applications on these MPSoC computing platforms are explored. The functionality and application concurrency are described in synchro...


Constructing Application-Specific Memory Hierarchies on FPGAs

The high performance potential of an FPGA is not fully exploited if a design suffers a memory bottleneck. Therefore, a memory hierarchy is needed to reuse data in on-chip buffer memories and minimize the number of accesses to off-chip memory. Buffer memories not only hide the external memory latency, but can also be used to remap data and augment the on-chip bandwidth through parallel access of...



Journal title:
  • IET Computers & Digital Techniques

Volume 3  Issue

Pages  -

Publication date 2009